NOVEL METHODS FOR OBTAINING, IDENTIFYING AND APPLYING 

NUCLEIC ACID SEQUENCES AND (POLY)PEPTIDES WHICH 
INCREASE THE EXPRESSION YIELDS OF PERIPLASMIC PROTEINS 

IN FUNCTIONAL FORM 

The present invention relates to a method for obtaining nucleic acid sequences 
encoding (poly)peptides which increase the expression yields of periplasmic proteins 
in functional form upon co-expression of said (poly)peptides and said periplasmic 
proteins. The invention also provides a method for the identification of said 
(poly)peptides. Furthermore, the present invention relates to a method for increasing 
the expression yields of periplasmic proteins in functional form by co-expressing 
(poly)peptides, for example Skp, FkpA, or a homolog of Skp or FkpA, in bacteria. 

Expression in the bacterial periplasm is the most convenient route to express 
foreign recombinant proteins, especially proteins containing disulphides, since the 
bacterial disulphide forming and isomerization machinery (Bardwell, 1994) can be 
utilised. Nevertheless, not all proteins can be produced with high functional yield in 
the E. coli periplasm, and no general method for optimizing the expression in 
functional form of poorly folding proteins secreted into the periplasm exists. 

Another field where the correct folding of proteins in the periplasm is of crucial 
importance is in phage display. This method has been used over the last decade 
to screen libraries not only of peptides but also of a large variety of proteins (Dunn, 
1996; McGregor, 1996). These displayed proteins are fused to a phage coat 
protein, e.g. to the N-terminus of the whole gene-3-protein (g3p) or to its C- 
terminal domain. These proteins therefore fold in the periplasm, while remaining 
anchored to the inner membrane by the C-terminal hydrophobic extension of g3p, 
before being incorporated into the phage coat. Therefore, the g3p fusion-proteins 
will almost certainly fold in the same environment and use the same machinery as 
periplasmically expressed proteins. Poorly folding proteins will most likely be lost 
over multiple screening rounds irrespective of their binding properties. 



Co-expression of the cytoplasmic chaperonins GroEL and GroES during M13 
phage assembly for Fab display was reported to lead to a 200-fold increase in 
phage titer (Soderlind, 1993). However, the relative amount of functional antibody 
fragments being displayed by the phage particles was not affected. It was 
speculated that GroEL/GroES assist in phage packing and assembly, although 
these steps take place in the periplasm. A general method for increasing the 
functional display of proteins on phage is not yet available. 

Consequently, there has been great interest in the question of the existence of 
periplasmic chaperones. However, unlike the well-characterized cytoplasmic 
machinery of E. coli, DnaK/DnaJ/GrpE and GroEL/GroES and possibly others 
(Makrides, 1996; Martin & Hartl, 1997; Buchner, 1996; EP 0 774 512 A3), the 
chaperone composition of the periplasm has remained poorly understood (Wall & 
Pluckthun, 1995; Missiakas et al., 1996). While progress in elucidating the signal 
transduction of periplasmic stress has been made (Missiakas & Raina, 1997), the 
ultimate effector molecules controlling periplasmic folding have remained obscure, 
although some proteins, such as FkpA or SurA, were believed to act as general 
periplasmic folding catalysts (Missiakas et al., 1996). FkpA was first 
described as very similar to the eukaryotic FK506 binding proteins (FKBPs) (Horne 
and Young, 1995), a class of well-characterized peptidyl-prolyl cis-trans 
isomerases (PPIs), which have been shown to be inhibited by the macrolide 
FK506. Missiakas and co-workers showed that the mature FkpA is located in the 
periplasm and assayed its activity (Missiakas et al., 1996). The estimated kcat/Km 
of the cis-trans isomerization of the Ala-Pro peptidyl-prolyl bond using succinyl-Ala- 
Ala-Pro-Phe-4-nitroanilide as substrate was 90 mM⁻¹ s⁻¹. FkpA is directly regulated 
by σE, which binds in its promoter region (Danese and Silhavy, 1997). The σE 
pathway is induced by heat stress and by conditions that lead to misfolding or 
misassembly of outer membrane proteins (OMPs), such as over-expression of 
OMPs or inactivation of the surA gene. 

Another protein which has been discussed in the context of periplasmic folding and 
protein transport is Skp. Skp is a very basic protein, which at first led to its 
misassignment as a DNA-binding protein (Holck et al., 1987), later as an outer 
membrane associated protein (Hirvas et al., 1990; Koski et al., 1990; Koski et al., 
1989), and a variety of synonyms (OmpH, HlpA) witness its unclear function. 
Homologs have been found in Salmonella typhimurium (Koski et al., 1990; Koski et 
al., 1989), Yersinia enterocolitica (Hirvas et al., 1991), Yersinia pseudotuberculosis 
(Vuorio et al., 1991), Haemophilus influenzae (Fleischmann et al., 1995) and 
Pasteurella multocida (Delamarche et al., 1995). Muller and co-workers (Thome et 
al., 1990) showed that this protein stimulates the in vitro import of E. coli proteins 
into membrane vesicles and subsequently established its periplasmic location 
(Thome & Muller, 1991), consistent with its soluble nature and the presence of a 
signal sequence. More recently, it was proposed to be involved in the transport of 
outer membrane proteins (Chen & Henning, 1996), and when its promoter region 
was interrupted by a Tn10 transposon, the extreme heat shock factor σE (σ24) 
dependent response was induced (Missiakas et al., 1996). However, it remained 
unclear whether this is an effect of the absence of Skp or a polar effect on other 
proteins located downstream of skp. The heat shock response was probably 
induced indirectly via a change in the concentration of outer membrane proteins, 
which is known (Missiakas et al., 1996) to induce σE (σ24). 

However, attempts to increase the expression of antibody fragments in functional 
form by over-expressing E. coli disulphide isomerase DsbA and/or proline cis-trans 
isomerase PPIase A did not significantly change the folding limit (Knappik et al., 
1993). It was concluded that aggregation steps in the periplasm compete with 
periplasmic folding, and that they may occur before disulphide formation and/or 
proline cis-trans isomerization take place and be independent of their extent. 

In summary, no protein has so far been identified that unambiguously acts 
as a periplasmic chaperone and which could be used to optimize the expression 
yield of a periplasmic protein in functional form. 

Thus, the technical problem underlying the present invention is to identify factors 
which increase the expression yield of periplasmic proteins in functional form in 
bacteria and to apply these factors to the optimization of expression of periplasmic 
proteins. The solution to the above technical problem is achieved by providing the 
embodiments characterized in the claims. Accordingly, the present invention allows 
the identification and application of nucleic acid sequences encoding (poly)peptides 
which increase the expression yield of periplasmic proteins in functional form, and/or 
the identification and application of the (poly)peptides. The technical approach of the 
present invention, i.e. the co-expression of a collection of (poly)peptides with said 
periplasmic protein in a collection of host cells to screen or select for such nucleic acid 
sequences and/or (poly)peptides, is neither provided nor suggested by the prior art. 

Thus, the present invention relates to a method for obtaining a nucleic acid sequence 
comprising a (poly)peptide coding sequence, which increases the expression yield of 
a periplasmic protein in functional form in bacteria upon co-expression of said 
periplasmic protein and said (poly)peptide, comprising the steps of: 

(a) providing a collection of host cells wherein each cell contains 

(i) a first nucleic acid sequence out of a collection of nucleic acid 
sequences, and 

(ii) a second nucleic acid sequence encoding said periplasmic protein; 

(b) causing or allowing expression of 

(i) (poly)peptides expressible from said collection of nucleic acid 
sequences, and 

(ii) said periplasmic protein expressible from said second nucleic acid 
sequence; 

(c) screening or selecting for a host cell expressing said periplasmic protein 
with increased functional yield; 

(d) optionally, repeating step (c) one or more times; 

(e) obtaining said first nucleic acid sequence contained in said host cell. 

The term "obtaining a nucleic acid sequence" as used herein includes the at least 
partial identification of the nucleic acid molecule, e.g. by sequencing, and/or the 
collection of the nucleic acid molecules, for example comprised in a vector, by 
biochemical techniques. 



In the context of the present invention, the term "(poly)peptide" relates to molecules 
consisting of one or more chains of multiple, i.e. two or more, amino acids linked via 
peptide bonds. 

The term "protein" refers to (poly)peptides where at least part of the (poly)peptide has 
or is able to acquire a defined three-dimensional arrangement by forming secondary, 
tertiary, or quaternary structures within and/or between its (poly)peptide chain(s). 
This definition comprises proteins such as naturally occurring or at least partially 
artificial proteins, as well as fragments or domains of whole proteins, as long as 
these fragments or domains have a defined three-dimensional arrangement as 
described above. 

The term "periplasmic protein" relates to proteins which, after biosynthesis in the 
cytoplasm, are transported across the inner membrane into the periplasm. This 
definition comprises proteins which remain in soluble or associated form in the 
periplasm, which are inserted in the inner or outer membrane, which are further 
secreted into the medium or which are assembled into complex structures such as 
filamentous phage particles which are then secreted. The periplasmic proteins will 
normally, but not necessarily, have at least a transport signal which directs the 
protein to the periplasm. 

The term "periplasmic protein in functional form" relates to a periplasmic protein, 
which has a defined function, and which folds during and after expression in a way 
which leads to a defined three-dimensional arrangement required for the protein to 
be functional. A "defined function" according to the present invention is any feature of 
the protein which depends on the correctly folded three-dimensional arrangement, 
and which can be detected or determined. This comprises functions such as 
enzymatic activity or binding to a target or binding partner, such as in the case of 
receptor/ligand or antibody/antigen pairs. In addition, in the context of the present 
invention, said "feature" referred to hereinabove may be the presence of the correctly 
folded three-dimensional arrangement itself, detected or determined, for example, by 
an antibody recognizing the correctly assembled three-dimensional arrangement of 
the protein, or by measuring physico-chemical properties such as fluorescence or 
α-helix content in fluorescence or CD spectra, respectively. 

The term "expression yield of a periplasmic protein in functional form" relates to the 
amount of a periplasmic protein being produced in functional form on expression. 

The term "(poly)peptides expressible from said first nucleic acid sequences" relates 
to (poly)peptides for which open reading frames (ORFs) exist on said first nucleic 
acid sequences and where preferably the operator elements necessary for 
expression are present on the corresponding vectors comprising said nucleic acid 
sequences. In the case that said nucleic acid sequences comprise fragments of 
genomic DNA, more than one ORF may be comprised in any one of said nucleic acid 
sequences. 

The term "functional yield" relates to the amount of said periplasmic protein being 
produced in functional form. 

Methods of designing, creating or obtaining nucleic acid sequences for expression, of 
constructing appropriate vectors, inserting nucleic acid sequences into vectors, 
choosing appropriate host cells, introducing vectors into host cells, causing or 
allowing expression of (poly)peptides or protein, isolating nucleic acids from host 
cells or identifying nucleic acid sequences and corresponding protein sequences are 
standard methods (Sambrook et al., 1989) which are well known to anyone of 
ordinary skill in the art. 

In a preferred embodiment, the method of the present invention further comprises the 
step of identifying a (poly)peptide coding sequence comprised in said first nucleic 
acid sequence. 

The term "identifying a (poly)peptide coding sequence comprised in said first nucleic 
acid sequence" relates to the situation referred to hereinabove, where more than one 
ORF is present in said first nucleic acid sequence. When more than one ORF is 
found, the identification optionally further comprises the analysis of individual ORFs 
and, if necessary, further testing such as repeating steps (a) to (e) with a set of 
nucleic acid sequences separately representing the individual ORFs. Said further 
testing can be performed by anyone of ordinary skill in the art. 
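
As an illustration of the ORF analysis referred to above, the following Python sketch lists candidate ORFs on both strands of an insert. It is not part of the described method: the function names, the minimum ORF length, the restriction to ATG start codons and the example sequence are assumptions made for illustration only.

```python
# Illustrative sketch only. All names, the ATG-only start rule and the
# minimum length are assumptions, not taken from the specification.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """List complete ORFs (ATG ... stop) in all six reading frames.

    Each hit is (strand, start, end, dna). start/end are 0-based,
    end-exclusive positions on the strand that was scanned, and
    min_codons counts the ATG but not the stop codon.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical insert carrying one short complete ORF on the forward strand.
print(find_orfs("CCATGAAACCCGGGTAACC"))
# [('+', 2, 17, 'ATGAAACCCGGGTAA')]
```

ORFs that run off the end of the insert are deliberately not reported here, since such truncated ORFs would have to be completed by comparison with published sequences. Alternative start codons such as GTG or TTG are also ignored in this simplified version.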



In a further preferred embodiment, said periplasmic protein is not expressible, or is 
expressible only in very low yields, in functional form when expressed under standard 
conditions, i.e. without the co-expression of said (poly)peptides. 

In another embodiment, the present invention relates to a method, wherein said 
periplasmic protein is a resistance marker, a nutritional marker, a reporter protein, a 
transactivator of transcription of marker genes or reporter genes, or a protein binding 
to a target. As has been stated hereinabove in step (c), the functional yield is 
determined by screening or selecting for an increase in protein function. 

If the protein is a periplasmic resistance marker such as β-lactamase or zeocin 
causing resistance to a certain antibiotic when functionally present in the periplasm, a 
selection is possible by culturing the host cells in the presence of said antibiotic. Host 
cells expressing the marker in functional form will be selected for. 

If the protein is a periplasmic nutritional marker such as maltose-binding protein or an 
amino-acid-binding protein, a selection is possible by using auxotrophic host cells 
and by culturing the cells in the presence of maltose, or the amino acid, respectively. 
Host cells expressing the marker in functional form will be selected for. 

If the protein is a periplasmic reporter protein such as alkaline phosphatase, a 
screening is possible by culturing the host cells in the presence of the corresponding 
substrate resulting in a colour reaction. Host cells expressing the reporter protein in 
functional form can thus be identified. 

If the protein is a secreted protein having enzymatic activity or binding to a target, a 
screening of the supernatant of individual cell cultures or of a collection of host cells 
on a plate can be performed by adding the appropriate substrate or target, 
respectively, to the medium and measuring or determining the amount of functional 
protein being secreted. 

It will be possible for a person of ordinary skill in the art, without undue burden, to 
identify and adapt existing screening or selection protocols, e.g. based on the various 
ELISA formats known, to arrive at protocols which are suitable for the individual 
proteins and the corresponding function to be screened or selected for. 

In a further preferred embodiment, said first nucleic acid sequence is or is derived 
from genomic DNA or mRNA of an organism, or cDNA. 

Further preferred is a method, wherein said genomic DNA is randomly fragmented. 
Genomic DNA can be fragmented by use of restriction enzymes or DNA cleaving 
enzymes, chemical cleavage, mechanical shearing or sonication. These are 
standard procedures well known to anyone of ordinary skill in the art (Sambrook et 
al., 1989). 

In a yet further preferred embodiment of the present invention, said first nucleic acid 
sequence comprises an at least partially randomized sequence. 
Such at least partially randomized sequences can be generated in various ways well 
known to the practitioner in the field, e.g. by random DNA syntheses using mixtures 
of mononucleotides or trinucleotides (Virnekas et al., 1994). There are numerous 
examples of collections of nucleic acid sequences encoding random peptide or 
antibody libraries which could be used in accordance with the present invention. 

In a further preferred embodiment, the present invention relates to a method, wherein 

(a) said first nucleic acid sequence is comprised in a vector which can be 
packaged in a filamentous phage particle, and 

(b) said periplasmic protein is a fusion protein of at least part of a filamentous 
phage coat protein and a further protein; 

and wherein in the course of said expression a collection of filamentous phage 
particles displaying said further protein is produced from said collection of host cells. 

The term "filamentous phage particles displaying said further protein" refers to 
particles prepared by the phage display method which has been developed and used 
extensively in the past 10 years. In said method, a foreign (poly)peptide or protein is 
genetically fused to a coat protein of a phage, in most cases of a filamentous phage 
such as M13, f1 or fd, whereby said phage displays said foreign (poly)peptide or 
protein at its surface. Many important aspects of phage display are summarized in 
various publications (e.g. Kay et al., 1996). 



In one further embodiment of the present invention, the vector wherein said first 
nucleic acid sequence is comprised is a phage vector or a phagemid vector. 
In the latter case, a helper phage will be used to supply phage proteins not encoded 
on the phagemid vector. 

In another embodiment of the present invention, the phage coat protein is gVIp, 
gVIIIp or, preferably, gIIIp. 

In a preferred embodiment of the present invention, binding of the displayed protein 
to a cognate binding partner is screened or selected for. 

If the protein is an antibody, the cognate binding partner is the corresponding antigen 
(and vice versa). In the case of a receptor, the cognate binding partner is its ligand 
(and vice versa). 

The particular advantage of this embodiment of the method of the present invention 
is that rare events leading to an increase in functional yield can be selected for since 
the selected phage particles can be used for infection of host cells and can thus be 
amplified. 
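
The amplification argument can be made concrete with a small numerical sketch in Python. It is a toy model and not a result from this work: it assumes that phages from a cell carrying a beneficial factor are recovered in panning with a constant advantage (`enrichment_per_round`, a hypothetical parameter) over all other phages, and that re-infection restores the pool size without bias.

```python
# Toy model of enrichment by repeated panning and re-infection.
# The constant per-round advantage is an assumption made for illustration.
def clone_fraction(initial_fraction: float,
                   enrichment_per_round: float,
                   rounds: int) -> list:
    """Return the fraction of the pool carrying the beneficial clone.

    The list starts with the fraction before the first round and holds
    one further entry per panning round.
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    fractions = [initial_fraction]
    f = initial_fraction
    for _ in range(rounds):
        selected = f * enrichment_per_round   # beneficial clone after panning
        background = 1.0 - f                  # all other clones
        f = selected / (selected + background)
        fractions.append(f)
    return fractions


# Hypothetical numbers: one clone in a million, ten-fold advantage per round.
for n, frac in enumerate(clone_fraction(1e-6, 10.0, 7)):
    print(f"round {n}: {frac:.3g}")
```

Under these assumptions a clone that is initially far too rare to detect dominates the pool after a handful of rounds, which is why a selection that can be iterated is preferable to a single screening step for rare events.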

In yet another embodiment, said screening or selection is for activity of the displayed 
further protein. 

If the activity is an enzymatic activity, the supernatant of individual host cell cultures 
can be used to assay for the enzymatic activity. 

In a still further embodiment, said further protein comprises at least a domain of the 
immunoglobulin superfamily, and preferably of the immunoglobulin family. 
In the context of the present invention, the term immunoglobulin superfamily (IgSF) 
refers to a family of proteins which are characterized by having at least a domain with 
the immunoglobulin fold, said superfamily comprising the immunoglobulins or 
antibodies, and various other proteins such as T-cell receptors or integrins. 

In a most preferred embodiment, said further protein is an immunoglobulin fragment 
taken from the list of Fv, scFv, disulphide-linked Fv, and Fab fragments. 
In this context, the term "Fv" refers to a fragment comprising the VL (variable light) 
and VH (variable heavy) portions of the antibody molecule. A "single-chain Fv" is a 
fragment in which the VL and VH chains are joined, in either a VL-VH or VH-VL 
orientation, by a peptide linker. A "disulphide-linked Fv" is a fragment stabilized by an 
inter-domain disulphide bond. This is a structure which can be made by engineering 
into each chain a single cysteine residue, wherein said cysteine residues from two 
chains become linked through oxidation to form a disulphide. The term "Fab" refers to 
a complex comprising the VL-CL (variable and constant light) and VH-CH1 (variable 
and first constant heavy) portions of the antibody molecule. 

In yet a further preferred embodiment, the invention relates to the method wherein 
said first and second nucleic acid are encoded on the same or on different vectors. 

In a still further embodiment, the present invention relates to a method for identifying 
a (poly)peptide which increases the expression yield of a periplasmic protein in 
functional form in bacteria upon co-expression of said periplasmic protein and said 
(poly)peptide, comprising the steps of: 

(a) identifying a nucleic acid sequence or a (poly)peptide coding sequence 
according to a method of the invention as outlined hereinabove, and 

(b) deducing a (poly)peptide therefrom. 

The deduction of a (poly)peptide can be achieved by translating the (poly)peptide 
encoding sequence into an amino acid sequence. By comparing the deduced 
(poly)peptide sequence with published protein sequences, or by comparing the 
(poly)peptide coding sequence identified as described above with published nucleic 
acid sequences, larger (poly)peptides, or (poly)peptide coding sequences, 
respectively, can be deduced and identified in cases where said first nucleic acid 
sequence did not comprise the full-length nucleic acid coding sequence of a protein. 
In addition to the method described hereinabove, the (poly)peptide may be identified 
directly by known methods from the host cells screened or selected for. For example, 
said (poly)peptide may be expressed as a fusion with a detection or labelling tag. The 
tagged (poly)peptide may be isolated and identified by amino acid sequencing. 
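
The deduction step, translating a (poly)peptide coding sequence into an amino acid sequence, can be sketched in Python as follows. The sketch assumes the standard genetic code and a coding sequence that starts in frame. The function name and the example sequence are hypothetical and are not taken from this work.

```python
# Illustrative sketch: translation with the standard genetic code.
from itertools import product

# Standard code written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence up to the first stop codon.

    Codons containing ambiguous bases are reported as 'X'. Trailing
    bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # stop codon ends the (poly)peptide
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical coding sequence.
print(translate("ATGAAACCCGGGTAA"))  # MKPG
```

The deduced sequence can then be compared with published protein sequences, as described above, to recognise cases in which the insert carries only part of a longer coding sequence.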

In a most preferred embodiment, the present invention relates to a method for 
increasing the expression of a periplasmic protein in functional form in a bacterial 
host cell, characterized by co-expressing said periplasmic protein and a (poly)peptide 
identified by a method according to the present invention. 
Preferably, said bacterial host cells are E. coli cells. 

In a further preferred embodiment, said periplasmic protein is not expressible, or is 
expressible only in very low yields, in functional form when expressed under standard 
conditions, i.e. without the co-expression of said (poly)peptides. 

In a yet further preferred embodiment, said periplasmic protein is a member of a 
collection of periplasmic proteins expressed in a collection of host cells. 
Several methods such as the phage display technology referred to hereinabove 
provide libraries of proteins for screening or selection procedures. However, the 
success of the procedures is limited by differences in expression yields of functional 
library members. For example, in the case of antibody fragments, it is known that the 
expression yields of fragments in functional form vary to a large extent. A high 
percentage of fragments comprised in antibody fragment libraries derived from 
immunoglobulin repertoires is found not to be expressible, or to be expressible only in 
very low yield, when expressed under standard conditions, i.e. without the co-expression of said 
(poly)peptides. 

When expressing periplasmic proteins with yet unknown biological function, or a 
collection of periplasmic proteins for the identification of a member with a certain 
property (e.g. when expressing an antibody fragment library with the goal to identify 
a fragment which binds to a pre-defined target), the term "expression ... in functional 
form" refers to structural features rather than to a defined biological function. In that 
context, a protein can be called "functional" when it folds into a three-dimensional 
arrangement representative for that kind of proteins. For example, when expressing a 
collection of antibody molecules or fragments thereof, "expression ... in functional 
form" is achieved when said molecules or fragments are expressed in a correctly 
folded form, the so-called immunoglobulin fold, since correct folding of an antibody 
binding site is a prerequisite for its function, i.e. the binding to a target. 

In a further preferred embodiment, said (poly)peptide is the E. coli protein Skp or a 
homolog thereof. 

In a further preferred embodiment, said (poly)peptide is the E. coli protein FkpA or a 
homolog thereof. 

Proteins are termed homologous if the percentage of the sum of identical and/or 
similar residues exceeds a defined threshold. This threshold is commonly regarded 
by those skilled in the art as being exceeded when at least 15% of the amino acids in 
the aligned sequences are identical, and at least 30% are similar. Similarity in that context 
refers to the physico-chemical properties of the amino acids, such as size, 
polarity, or charge. 
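
The threshold described above can be checked on a pairwise alignment with a short Python sketch. Several points are assumptions made for illustration, since the text does not fix them: the grouping of amino acids into similarity classes, the convention that identical residues also count as similar, and the use of gap-free alignment columns as the denominator. The function names and the example alignment are hypothetical.

```python
# Illustrative sketch. The similarity classes below are an assumed grouping
# by size, polarity and charge, not one defined in the specification.
SIMILARITY_GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
GROUP_OF = {aa: i for i, group in enumerate(SIMILARITY_GROUPS) for aa in group}


def identity_and_similarity(aln_a: str, aln_b: str):
    """Return (% identical, % similar) over gap-free alignment columns.

    Both inputs are rows of one pairwise alignment with '-' for gaps.
    Identical residues are also counted as similar (assumption).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


def is_homologous(aln_a: str, aln_b: str,
                  min_identity: float = 15.0,
                  min_similarity: float = 30.0) -> bool:
    """Apply the 15% identity / 30% similarity threshold from the text."""
    identity, similarity = identity_and_similarity(aln_a, aln_b)
    return identity >= min_identity and similarity >= min_similarity


# Hypothetical six-column alignment.
print(identity_and_similarity("MKV-LD", "MRIALE"))  # (40.0, 100.0)
print(is_homologous("MKV-LD", "MRIALE"))            # True
```

In practice the alignment itself would come from a standard alignment program, and a substitution matrix could replace the fixed similarity classes. The sketch only shows how the two percentages in the definition are obtained once an alignment is available.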

Proteins which are homologous to Skp are known from organisms such as 
Salmonella typhimurium (Koski et al., 1990; Koski et al., 1989), Yersinia 
enterocolitica (Hirvas et al., 1991), Yersinia pseudotuberculosis (Vuorio et al., 1991), 
Haemophilus influenzae (Fleischmann et al., 1995) and Pasteurella multocida 
(Delamarche et al., 1995). 

Proteins which are homologous to FkpA are present e.g. in many pathogenic bacteria 
(Horne and Young, 1995). In Legionella pneumophila the corresponding protein 
showing PPI activity is called MipA. 

In a yet further preferred embodiment, the invention relates to a method wherein said 
periplasmic protein is a fusion protein of at least part of a filamentous phage coat 
protein and a further protein. 

Still further preferred is a method wherein said further protein comprises at least a 
domain of the immunoglobulin superfamily, and preferably of the immunoglobulin 
family. 

Most preferably, the invention relates to a method wherein the further protein is an 
immunoglobulin fragment taken from the list of Fv, scFv, disulphide-linked Fv, and 
Fab fragments. 

In yet a further preferred embodiment, the invention relates to the method wherein 
the nucleic acid sequence encoding said (poly)peptide, preferably Skp, FkpA, or a 
homolog of Skp or FkpA, and the gene encoding said periplasmic protein are 
encoded on the same or on different vectors, or wherein the nucleic acid sequence 
encoding the (poly)peptide, preferably Skp, FkpA, or a homolog of Skp or FkpA, is 
integrated in the genome of the bacterial host. 

FIGURE CAPTIONS 

Figure 1. Selection scheme. A. Principle of selection. An E. coli genomic library is 
co-expressed with a scFv-fragment, fused to g3p. While the antibody is the same 
throughout, its folding yield varies depending on the co-expressed factor. This 
factor is not displayed on the phage, but expressed in the host cell producing the 
phage, which in the case of a useful factor leads to better "quality" scFv 
fragments displayed. Since the gene for the factor is encoded on the phage, its 
information becomes enriched by phage panning on antigen. B. Phagemid vector 
used for library construction. 

Figure 2. Analysis of phagemid pools after different panning rounds. For each 
round of phage proliferation, phagemids were prepared from cells harvested from 
overnight cultures. The phagemid pools were analyzed by restriction digest with 
NotI. M: PstI-digested λ-DNA as molecular weight marker, lane 1 to 7: phagemids 
from infected cells after the panning round, lane 0: phagemids before the first 
panning round. 

Figure 3. Schematic representation of the 952 bp insert enriched by phage display 
and panning. YaeT is the product of an ORF of 810 aa of unknown function. It 
shows 66.2% similarity to amino acids of the protective surface antigen D15 of 
Haemophilus influenzae and those of other bacteria (see Example 1.3.). The gene 
lpxD codes for UDP-3-O-[3-hydroxymyristoyl]-glucosamine-N-acyltransferase. The 
only complete ORF found on this insert is the gene skp. Note that the Sau3AI 
fragment obtained is the smallest one which contains the full expression unit of 
Skp. SD, Shine-Dalgarno sequence; p, promoter region, predicted by neural 
network analysis (Reese, 1994). 

Figure 4. Antigen-binding ELISAs of phages grown with or without over-expressed 
Skp, displaying the scFv fragments 4-4-20 (Nieba et al., 1997), 4D5Flu (Jung and 
Pluckthun, 1997), FITC-E2 (Vaughan et al., 1996; Krebber et al., 1997a) and 
ABPC48-CH22S (Proba et al., 1997; Proba et al., 1998). Phages were purified by 
CsCl gradients as described in Example 1.2. Phages grown in the presence of 
over-expressed Skp reach higher antigen-binding ELISA signals compared to 
phages grown without over-expressed Skp. Inhibition with soluble antigen shows 
that binding is specific. 

Figure 5. Phage blot. Phage carrying g3p-fusion of the scFv fragments 4-4-20 
(Nieba et al., 1997), 4D5Flu (Jung and Pluckthun, 1997), FITC-E2 (Vaughan et al., 
1996; Krebber et al., 1997a), McPC603-H11 (Knappik and Pluckthun, 1995), 
ABPC48-CH22S (Proba et al., 1997; Proba et al., 1998) and AL214 were grown 
with or without over-expressed Skp. Phages were purified by CsCl gradients as 
described in Example 1.2. In the presence of over-expressed Skp, more fusion 
protein is incorporated into the phages than in the absence of Skp on the plasmid. 
Helper phage VCS M13 (Stratagene) was loaded as size reference for g3p wt. 

Figure 6. Crude extract ELISA. Crude extracts were prepared, as described in 
Example 1.2., from E. coli JM83 expressing the soluble scFv fragments 4-4-20 
(Nieba et al., 1997) and ABPC48-CH22S (Proba et al., 1997; Proba et al., 1998) 
with or without over-expressed Skp. ScFv fragments produced in the presence 
of Skp give higher antigen binding ELISA signals than without skp on the plasmid. 
Inhibition with soluble antigen shows that binding is specific. 

Figure 7. Analysis of phagemid pools after different panning rounds. For each 
round of phage proliferation, phagemids were prepared from cells harvested from 
overnight cultures. The phagemid pools were analyzed by restriction digest with 
NotI. M: PstI-digested λ-DNA as molecular weight marker, lane 1 to 7: phagemids 
from infected cells after the panning round, lane 0: phagemids before the first 
panning round. 

Figure 8. Schematic representations of the 1629 bp and 1987 bp inserts enriched by 
phage display and panning. YheO is a protein with strong similarity to H. 
influenzae HI0575. FkpA is a peptidyl-prolyl cis-trans isomerase. SlyX is a protein 
with yet unknown function. SlyD is a peptidyl-prolyl cis-trans isomerase. YheP is a 
protein with yet unknown function. 

Figure 9. Antigen-binding ELISAs of phages grown with or without over-expressed 
Skp, FkpA, SlyX, FkpA+SlyX, and FkpA+Skp, displaying the scFv ABPC48- 
C(H22)S (Proba et al., 1997; Proba et al., 1998). Phages were purified by CsCl 
gradients as described in Example 1.2. Phages grown in the presence of 
over-expressed Skp reach higher antigen-binding ELISA signals compared to phages 
grown without over-expressed Skp. Inhibition with soluble antigen shows that 
binding is specific. 

The examples illustrate the invention: 

EXAMPLES 
Example 1: Identification and application of Skp 
1.1. Introduction 

We have made use of the hypothesis that g3p fusion-proteins might fold in the 
same environment and use the same machinery as periplasmically expressed 
proteins, in a search for cellular factors which might aid both the folding of 
periplasmic proteins and proteins displayed on phage. We have used a very poorly 
folding single-chain Fv fragment of the antibody 4-4-20 (Bedzyk et al., 1990), 
(Whitlow et al., 1995), specific for fluorescein, as a model system. Previous work 
(Nieba et al., 1997) showed that this scFv strongly aggregates in the bacterial 
periplasm, even though the same protein, once in the native state, is very soluble 
and stable. This indicates that not the final product limits the yield, but that the 
folding pathway branches off to aggregates. For this and similar cases, the folding 
yield is a kinetic and not a thermodynamic problem, and can thus potentially be 
helped by cellular factors. 

We wished to identify such factors without any prejudice concerning whether they 
are membrane-bound, periplasmic, or even cytoplasmic proteins. We also wanted 
to be able to find any factors which might affect the total yield of the product without
directly influencing folding. We therefore developed a selection system making use
of phage display (Fig. 1A). The poorly folding scFv fragment was displayed as a
fusion protein with g3p, and a library of E. coli proteins was co-expressed on the
same phagemid. We reasoned that, if a particular E. coli cell produces a beneficial
factor encoded on this phagemid, this cell will give rise to phages which 
outcompete the other phages, because a higher fraction of the phages will display 
correctly folded scFv. Thus, while the displayed scFv is genetically identical on all 
phages, this method selects for the effect of the additional factor encoded on the
same phagemid, even though the factor itself is not displayed. This factor
improves the "quality" of the scFv, namely the percentage of correctly folded 
molecules displayed on the phage, by acting on the host cell which produces the 
particular phage. 
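
The selection principle described above can be illustrated with a minimal numerical sketch. It assumes a deliberately simple model that is not part of the experimental work: one favored phagemid clone starts at a given share of the pool, and in every panning round its phages are recovered a fixed number of times more often than all others. The function name, the starting share and the fivefold advantage are hypothetical and serve only to show how a rare clone can come to dominate a pool within a few rounds.

```python
def enrichment_trajectory(initial_fraction, advantage, rounds):
    """Share of the favored phagemid in the pool after each panning round.

    initial_fraction: starting share of the favored clone (0 < x < 1).
    advantage: hypothetical per-round recovery ratio (favored vs. other clones).
    rounds: number of panning rounds to simulate.
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    fractions = [initial_fraction]
    f = initial_fraction
    for _ in range(rounds):
        # Favored phages are recovered `advantage` times more often;
        # dividing by the total recovered renormalizes the pool to 1.
        f = f * advantage / (f * advantage + (1.0 - f))
        fractions.append(f)
    return fractions


if __name__ == "__main__":
    # Hypothetical numbers: one favored clone among 50,000, fivefold advantage.
    for n, f in enumerate(enrichment_trajectory(1 / 50_000, 5.0, 7)):
        print(f"round {n}: fraction {f:.6f}")
```

In this model the odds of the favored clone are multiplied by the advantage in every round, so the outcome depends strongly on the assumed advantage, which is not known from the experiments.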

Using this strategy (Fig. 1A), a gene from E. coli was enriched, coding for the 
periplasmic protein Skp, which had been suspected previously to have a role in 
folding or transport of outer membrane proteins (Chen & Henning, 1996). 

1.2. Experimental Protocols 

Construction of genomic library. A NotI site was inserted in the phage display
vector pAK100 (Krebber et al., 1997) at position 5656. A polylinker was inserted as
an oligonucleotide cassette into this NotI site. The gel-purified SfiI fragment
encoding the scFv fragment of the anti-fluorescein antibody 4-4-20 (Bedzyk et al.,
1990) was ligated into this vector, pHB100, yielding the plasmid pHB102. Genomic
DNA of E. coli JM83 was isolated with Qiagen-tip 100G according to the
manufacturer's protocol. The genomic DNA was partially digested with Sau3AI and
applied to a 1% agarose gel. The range of 1 kb to 6 kb length was cut out, and the
genomic DNA eluted with GenElute™ agarose spin columns (Supelco),
phenol/chloroform extracted and ethanol precipitated. After ligation of the E. coli
library in the BglII site of the polylinker of pHB102, the ligation mixture was
precipitated with n-butanol and electroporated into E. coli XL1-Blue (Stratagene).
After plating on 2xYT in 530 cm² dishes (Nunc) and overnight incubation at 37°C,
the colonies were washed off the plates with 5 ml 2xYT, the OD550 was
determined and the cells stored at -80°C after addition of glycerol to 10% final
concentration.
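
The size selection of a partial Sau3AI digest can be sketched in silico. Sau3AI recognizes GATC and leaves a GATC overhang that is compatible with BglII-cut ends, which is why the fragments can be ligated into the BglII site. The sketch below is illustrative only: it treats the DNA as a linear string, ignores digestion kinetics, and simply enumerates every fragment bounded by two GATC sites whose length falls in the 1 kb to 6 kb window. The function names are hypothetical.

```python
import re

SAU3AI_SITE = "GATC"  # Sau3AI cuts immediately 5' of its GATC recognition site


def cut_positions(seq, site=SAU3AI_SITE):
    """Return the 0-based start position of every recognition site in seq."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def partial_digest_fragments(seq, min_len=1000, max_len=6000):
    """List (start, end) pairs of fragments bounded by any two cut sites.

    A partial digest leaves some sites uncut, so fragments may span
    several sites; only lengths within [min_len, max_len] are kept.
    """
    cuts = cut_positions(seq)
    fragments = []
    for i, start in enumerate(cuts):
        for end in cuts[i + 1:]:
            length = end - start
            if length > max_len:
                break  # cuts are sorted, so later ends are longer still
            if length >= min_len:
                fragments.append((start, end))
    return fragments


if __name__ == "__main__":
    # Toy sequence with three sites at positions 0, 1000 and 3000.
    toy = "GATC" + "A" * 996 + "GATC" + "A" * 1996 + "GATC"
    print(partial_digest_fragments(toy))
```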

Phage panning. A 10 ml culture of 2xYT, containing 15 µg/ml tetracycline (tet)
and 0.1 ml salt mixture (0.86 M NaCl, 0.25 M KCl, 1 M MgCl2), was
inoculated to an OD550 of 0.1 with E. coli harboring the genomic library. After 1 h
incubation at 37°C chloramphenicol (cam) was added to a concentration of 30
µg/ml, and the cells were grown to an OD550 of 0.5. Then 10¹² pfu of helper
phage VCS M13 (Stratagene) were added and the culture was incubated for 15 min
without agitation at 37°C, followed by addition of 50 ml 2xYT medium containing
30 µg/ml cam, 15 µg/ml tet, 0.5 ml salt mixture and 0.1 mM
isopropyl-β-D-thiogalactoside (IPTG), and then shaken for 2 h at 37°C. After
addition of 30 µg/ml kanamycin (kan) the cultures were grown overnight at 37°C.
The cells were harvested and the phagemid DNA isolated (QIAprep spin kit, Qiagen).
The phages from the culture supernatant were precipitated by incubation for 30 min
with 1/4 volume PEG/NaCl solution (17% PEG 6000, 3.3 M NaCl, 1 mM EDTA) on ice,
and the pellets were redissolved in 2 ml PBS (8 mM Na2HPO4, 1.8 mM KH2PO4,
137 mM NaCl, 3 mM KCl, pH 7.4 (Sambrook et al., 1989)). Immunotubes (Nunc) were
coated with 20 µg/ml fluorescein-isothiocyanate coupled to bovine serum albumin
(FITC-BSA) in PBS overnight at 4°C and blocked with 5% skimmed milk in PBST (PBS
containing 0.05% Tween-20) for at least 1 h at room temperature. Five hundred µl
of the phage solution was made up to a final volume of 5 ml with 2% skimmed milk
in PBST and applied to the tubes for 2 h at room temperature. The tubes were
washed 20 times with PBST and 2 times with PBS. Bound phages were eluted with
1 ml 0.1 M glycine/HCl pH 2.2 for 10 min. The eluate was neutralized immediately
with 60 µl 2 M Tris and the phages (typically 10⁴-10⁶ cfu) were used for
re-infection.
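
As a small worked calculation, the final concentrations contributed by the salt mixture can be derived from the stock concentrations given above. The sketch assumes that the stated culture volumes (10 ml and 50 ml) are final volumes, which the protocol does not state explicitly; the function names are hypothetical.

```python
# Stock concentrations of the salt mixture as given in the protocol (mol/l).
SALT_MIX_M = {"NaCl": 0.86, "KCl": 0.25, "MgCl2": 1.0}


def final_millimolar(stock_molar, added_ml, final_ml):
    """Final concentration in mM after diluting added_ml of stock to final_ml."""
    return stock_molar * added_ml / final_ml * 1000.0


def salt_mix_final(added_ml, final_ml):
    """Final mM concentration of each salt, assuming final_ml is the total volume."""
    return {salt: final_millimolar(conc, added_ml, final_ml)
            for salt, conc in SALT_MIX_M.items()}


if __name__ == "__main__":
    # 0.1 ml in 10 ml and 0.5 ml in 50 ml are the same 1:100 dilution.
    for added, total in ((0.1, 10.0), (0.5, 50.0)):
        print(added, total, salt_mix_final(added, total))
```

Under this assumption both additions correspond to a 1:100 dilution, i.e. roughly 8.6 mM NaCl, 2.5 mM KCl and 10 mM MgCl2 from the salt mixture alone.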

Phage purification and ELISA. Phage ELISAs were carried out to assay the
amount of functionally displayed scFv on M13 phages. Single colonies were grown
at 37°C overnight in 5 ml 2xYT medium containing 30 µg/ml cam and 15 µg/ml tet.
Ten ml of 2xYT medium containing 30 µg/ml cam, 15 µg/ml tet, 0.4% glucose and
0.1 ml salt mixture was inoculated with the overnight culture to give an OD550 of
0.1. At an OD550 of 0.3 to 0.5, 10¹² cfu VCS helper phage (Stratagene) were
added. After 15 min, 50 ml 2xYT medium containing 30 µg/ml cam, 15 µg/ml tet,
0.5 ml salt mixture and 0.1 mM IPTG was added. After 2 h at 37°C, kan was added
to a final concentration of 30 µg/ml and the cells were grown overnight. The
phages were precipitated from the culture supernatant by incubating for 30 min
with 1/4 volume PEG/NaCl solution as above on ice, and the pellets were
redissolved in 1 ml PBS.

After addition of 1.6 g of CsCl, the volume was adjusted to 4 ml with PBS. The
CsCl solution was transferred into a 1/2 x 1 1/2 in. polyallomer tube and
centrifuged at 100,000 rpm for 4 hr in a TLN-100 rotor (Beckman Instruments) at
4°C. After centrifugation the phage band was removed as described (Smith & 
Scott, 1990). The phages were transferred to 1/2 x 2 in. polycarbonate tubes, 
which were filled with PBS to 3 ml. After centrifugation at 50,000 rpm for 1 hr in a 
TLA-100.3 rotor at 4°C, the pelleted phages were redissolved in 3 ml PBS. After 
an additional centrifugation at 50,000 rpm for 1 hr in a TLA-100.3 rotor at 4°C, the 
phages were dissolved in PBS. The concentration of phage particles was 
quantified spectrophotometrically (Day, 1969).

ELISA plates were coated with FITC-BSA in PBS or, for anti-levan antibodies, with
10 µg/ml levan (polyfructose, Sigma) in PBS at 4°C overnight. The plates were
blocked for 1 h at room temperature. A defined number of purified phages
(measured by OD) (Day, 1969) were mixed with 2% skimmed milk in PBST in the
absence or presence of 10 µM fluorescein or 0.05% levan and applied to the
blocked ELISA plates and incubated for 1 h at room temperature. Detection was
as above, using an anti-M13 antibody conjugated with horseradish peroxidase 
(Pharmacia). 
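
The competition control described above (binding measured in the absence or presence of soluble antigen) is commonly summarized as a percent inhibition. The following is a minimal sketch of that arithmetic, not a procedure taken from the protocol; the function name, the optional background subtraction and the absorbance values are hypothetical.

```python
def percent_inhibition(signal_free, signal_inhibited, background=0.0):
    """Percent reduction of the ELISA signal caused by soluble antigen.

    signal_free: reading without soluble antigen.
    signal_inhibited: reading with soluble antigen present.
    background: reading of a blank well (hypothetical optional correction).
    """
    net_free = signal_free - background
    net_inhibited = signal_inhibited - background
    if net_free <= 0:
        raise ValueError("uninhibited signal must exceed the background")
    return 100.0 * (1.0 - net_inhibited / net_free)


if __name__ == "__main__":
    # Hypothetical absorbance readings, for illustration only.
    print(percent_inhibition(1.0, 0.25))
```

A value close to 100% would indicate that nearly all of the signal can be competed away by soluble antigen, which is the sense in which binding is called specific.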

Phage blots. For phage blots 10¹¹ phages were applied to a reducing 11% SDS-PAGE,
and blotted on nitrocellulose membranes. Detection was carried out with
the monoclonal antibody 10C3 (Tesar et al., 1995), which recognizes the
C-terminal half of g3p (1:50,000 in TBST (25 mM Tris/HCl pH 7.5, 150 mM NaCl,
0.05% Tween-20) containing 2% milk), for 60 min at RT, followed by incubation
with a polyclonal anti-mouse-peroxidase conjugate (Pierce) (1:5000 in TBST/2% 
milk, 45 min RT), and using the ECL-kit (Amersham). 

Crude extract ELISA. Fifty ml of LB medium containing 30 µg/ml cam were
inoculated with a single colony of E. coli JM83, harboring a plasmid encoding the
respective scFv fragment. The cultures were grown at 24°C to an OD550 of 0.5
and induced with 1 mM IPTG. After overnight induction at 24°C the cells were
harvested and resuspended in 4 ml PBS. Whole cell extracts were prepared by
French Press lysis at 10,000 psi and 1 ml of the crude extract was centrifuged in
an Eppendorf tube for 30 min at 50,000 rpm in a TLA-100.3 rotor (Beckman
Instruments) at 4°C. After centrifugation the supernatants containing the soluble
material were normalized to an OD550 of 20 in 1 ml. ELISA plates were coated
and blocked as described above for phage ELISAs. A defined amount of crude
extract was mixed with 2% skimmed milk in PBST in the absence or presence of
10 µM fluorescein or 0.05% levan and applied to the blocked ELISA plates and
incubated for 1 h at room temperature. The signal was detected with an
anti-myc-tag antibody (Munro & Pelham, 1986) and an anti-mouse antibody conjugated
with horseradish peroxidase and soluble BM blue POD-substrate
(Boehringer-Mannheim), and after stopping the reaction with 0.1 M HCl, the signals
were read at 405 nm.

Protein purification. The anti-phosphorylcholine scFv McPC603-H11 (Knappik &
Plückthun, 1995) was purified using PC-Sepharose affinity chromatography
(Skerra & Plückthun, 1988) in the presence or absence of co-expressed Skp. The
concentration and yield were estimated photometrically using a calculated
extinction coefficient (Gill & von Hippel, 1989).
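
A calculated extinction coefficient of this kind is usually obtained by summing per-residue contributions at 280 nm. The sketch below uses the values commonly attributed to Gill & von Hippel (5690 per Trp, 1280 per Tyr and 120 per cystine, in M^-1 cm^-1) and then applies the Beer-Lambert law. The example sequence, the disulphide count and the absorbance are hypothetical and are not data for McPC603-H11.

```python
# Per-residue molar extinction contributions at 280 nm (M^-1 cm^-1),
# as commonly attributed to Gill & von Hippel (1989).
EPS_TRP = 5690
EPS_TYR = 1280
EPS_CYSTINE = 120


def extinction_coefficient(sequence, n_disulfides=0):
    """Estimate the molar extinction coefficient at 280 nm from a sequence.

    sequence: one-letter amino acid sequence.
    n_disulfides: number of disulphide bonds (cystines), supplied by the user.
    """
    seq = sequence.upper()
    return (seq.count("W") * EPS_TRP
            + seq.count("Y") * EPS_TYR
            + n_disulfides * EPS_CYSTINE)


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration (mol/l) = A / (epsilon * path length)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical peptide with two Trp, one Tyr and one disulphide bond.
    eps = extinction_coefficient("WCAYWC", n_disulfides=1)
    print(eps, molar_concentration(0.5, eps))
```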

1.3. Identification of Skp

We used a phagemid displaying the poorly folding scFv fragment of the anti- 
fluorescein antibody 4-4-20 (Bedzyk et al., 1990; Nieba et al., 1997) as the 
recipient for an E. coli genomic library. E. coli DNA was size-fractionated from 1 to 
6 kb and ligated into a polylinker placed between the colE1 origin of replication and 
the chloramphenicol (cam) resistance gene of a phagemid (Fig. 1B) developed for 
phage display (Krebber et al., 1997). Thus, E. coli genes, regulated under their 
own promoters, are over-expressed on the phagemid, primarily through an effect
of vector copy number. A library size of 5x10⁴ clones ensured that each piece of
the E. coli genome should be represented, provided it led to viable clones.
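
The statement that a library of this size should represent the whole genome can be checked with the standard coverage estimate often attributed to Clarke and Carbon: the probability that a given sequence is present in N random clones is P = 1 - (1 - I/G)^N, where I is the insert length and G the genome length. The sketch below assumes a genome of about 4.6 Mb and takes the smallest selected insert size (1 kb) as a conservative case; both choices, and the function names, are assumptions made here rather than figures from the experiments.

```python
import math


def coverage_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given sequence is present among n_clones random inserts."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


def clones_needed(probability, insert_bp, genome_bp):
    """Number of clones required to reach the desired coverage probability."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    GENOME_BP = 4.6e6   # assumed approximate E. coli genome size
    INSERT_BP = 1000    # smallest insert size selected on the gel
    print(coverage_probability(50_000, INSERT_BP, GENOME_BP))
    print(clones_needed(0.99, INSERT_BP, GENOME_BP))
```

The estimate ignores cloning bias and inserts that are toxic to the host, which is consistent with the proviso that a fragment must lead to viable clones.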

Seven panning rounds on BSA-fluorescein were carried out (Fig. 1A), and after
each round the phagemid DNA was cut with the restriction enzyme NotI to detect
the accumulation of any inserts. It can be seen in Fig. 2 that a band of about 990
bp accumulates throughout the panning. Four of 8 single colonies analyzed after
the seventh round carried this insert, which was sequenced and found to contain
only the complete gene for the periplasmic protein Skp (Holck & Kleppe, 1988),
from 272 bases upstream of the start codon to 199 bases downstream of the stop
codon (Fig. 3). Four bases after the stop codon of skp is the start codon of lpxD
(firA), leading to a truncated peptide of the 65 N-terminal amino acids of this
protein. One-hundred and twenty-five base pairs upstream of the start codon of the 
skp gene lies the stop codon of an open reading frame, which codes for an 810 aa 
protein with unknown function. A homology search showed 66.2% similarity to the 
protective surface antigen D15 of Haemophilus influenzae (Swiss-Prot: P46024), 
and surface proteins of Pasteurella multocida (TREMBL: Q51930) and Neisseria 
gonorrhoeae (TREMBL: P95359). 

1.4. Effects of Skp co-expression on phage display

To determine how and why Skp gets enriched, we first characterized the phages 
produced in the absence or presence of co-expressed Skp. For this purpose, we 
cloned a variety of different scFv fragments in the phagemid with and without skp. 
The phage titer was indistinguishable within experimental error, demonstrating that 
Skp is not selected because it would lead to the production of more phages. 
However, the antigen binding phage ELISA signal from the same number of 
purified phage particles is higher, proving that the Skp over-expression increases 
the number of functional antibody molecules on the phage (Fig. 4). This effect is 
seen with all four antibodies tested, albeit to different degrees. 

We then determined whether the total amount of fusion protein per phage is also 
increased by the over-expression of Skp. For this purpose we analyzed the 
amount of full-length fusion protein on purified phage particles in the presence or
absence of skp on the phagemid by Western blot, using the monoclonal antibody
10C3 (Tesar et al., 1995) (Fig. 5). For the scFv 4-4-20, the co-expression of Skp 
dramatically increased the presence of fusion protein presented on the phage. 
Since the same amount of purified phages was loaded on the gel, Skp must 
facilitate the incorporation of functional fusion protein into the phage, which is also 
reflected by the antigen-binding ELISA (see above). It can be seen that Skp has
this effect on all of the six antibodies tested, albeit to different extents and to
different final levels of incorporation. Since the fusion protein is still only a minor
species when compared to g3p wt (encoded by the helper phage), we cannot 
reliably determine a decrease of g3p wt, but most likely, the scFv-g3p fusion 
protein takes the place of g3p wt more often in the presence of over-expressed 
Skp. 

1.5. Effects of Skp co-expression on soluble protein expression 

We then determined the effect of Skp on the production of several of the scFv 
fragments in soluble form using the non-suppressor strain JM83. Using antigen- 
binding ELISA (Fig. 6) it can be seen that the amount of soluble scFv was 
dramatically increased in the presence of co-expressed Skp. To demonstrate that 
this is also reflected in the yield of purified protein, the scFv fragment of the
anti-phosphorylcholine binding antibody McPC603-H11 (Knappik & Plückthun, 1995)
was tested. Co-expression of Skp increased the amount of protein purified by
affinity chromatography on phosphorylcholine by about a factor of 4.

The results of Fig. 4 and Fig. 5 suggest that the more an scFv fragment tends to 
aggregate in the periplasm of E. coli, the stronger is the influence of Skp in the
phage ELISA. For the scFv fragment of FITC-E2 (Vaughan et al., 1996; Krebber et
al., 1997a), which folds very well and shows little insoluble material when
expressed in the periplasm, we observed a reduced influence of Skp in the phage
ELISA, compared to the poorly folding scFv 4-4-20 (Nieba et al., 1997) and
ABPC48-C(H22)S (Proba et al., 1997; Proba et al., 1998). The engineered Flu4D5
(Jung & Plückthun, 1997), with improved properties compared to 4-4-20, is
intermediate. This shows that the better an scFv is functionally expressed and the
less aggregation-prone it is, the less is the influence of Skp, suggesting that Skp
supports the correct folding of poorly expressed scFv fragments and their fusion
proteins. 

Example 2: Identification and application of FkpA 
2.1. Experimental Protocols 

Construction of genomic library. The gel-purified SfiI fragment encoding the
scFv fragment of the anti-levan antibody ABPC48-C(H22)S (Proba et al., 1997;
Proba et al., 1998) was ligated in the vector pHB100 (Bothmann and Plückthun,
1998), yielding the plasmid pHB121. Genomic DNA of E. coli RC354c (Chen and
Henning, 1996) was isolated with Nucleobond AXG100 cartridge according to the
manufacturer's protocol (Macherey-Nagel). The genomic DNA was partially
digested with Sau3AI and applied to a 1% agarose gel. The range of 1 kb to 6 kb
length was cut out, and the genomic DNA eluted with GenElute™ agarose spin
columns (Supelco), phenol/chloroform extracted and ethanol precipitated. After
ligation of the E. coli library in the BglII site of the polylinker of pHB121, the ligation
mixture was precipitated with n-butanol and electroporated into E. coli XL1-Blue
(Stratagene). After plating on 2xYT in 530 cm² dishes (Nunc) and overnight
incubation at 37°C, the colonies were washed off the plates with 5 ml 2xYT, the
OD550 was determined and the cells stored at -80°C after addition of glycerol to
50% final concentration.

Phage panning. Phage panning was done as described in Example 1. Instead of
using FITC-BSA, the immunotubes (Nunc) were coated with 10 µg/ml levan
(polyfructose, Sigma) in PBS overnight at 4°C.

Phage purification and ELISA. Phage purification and ELISA were done exactly
as described in Example 1.

2.2. Phage selection and identification of co-expressed factor.

We used a phagemid displaying the poorly folding scFv fragment of the anti-levan 
antibody ABPC48-C(H22)S as the recipient for an E. coli genomic library. E. coli 
RC354c-DNA (skp-deficient strain) was size-fractionated from 1 to 6 kb and ligated 
into the polylinker of plasmid pHB121. Thus, E. coli genes, regulated under their 
own promoters, are over-expressed on the phagemid, primarily through an effect 
of vector copy number. A library size of 6.4x10⁵ clones ensured that each piece of
the E. coli genome should be represented, provided it led to viable clones. 

Seven panning rounds on levan were carried out, and after each round the 
phagemid DNA was cut with the restriction enzyme NotI to detect the accumulation
of any inserts. It can be seen in Fig. 7 that two bands of about 1.7 kb and 2.0 kb
accumulate throughout the panning. Fourteen of 17 single colonies analyzed after
the seventh round carried the 1.7 kb insert, 3 the 2.0 kb insert. Both inserts were
sequenced and the 1.7 kb band contains an insert of 1629 bp length, the 2.0 kb
band an insert of 1987 bp length (Fig. 8). Both inserts contain the same two
complete genes coding for the periplasmic protein FkpA (Horne and Young, 1995;
Missiakas et al., 1996) and the protein SlyX with unknown function. Both inserts 
start 218 bp upstream of the stop codon of fkpA. Therefore both inserts code for 
the first 21 aa of the gene yheO, which codes for a protein with strong similarity to 
Haemophilus influenzae HI0575. The 1629 bp insert ends 159 bp downstream of 
the stop codon of slyX, whereas the 1987 bp insert ends 528 bp downstream of 
the stop codon of slyX. Both inserts code for the C-terminal part of SlyD (Roof et 
al., 1994; Roof et al., 1997; Hottenrott et al., 1997). The 1987 bp insert codes also 
for the first 117 aa of YheP, a protein with unknown function. To examine which
of the two complete genes is responsible for enrichment, fkpA and slyX were
PCR-amplified and precisely recloned separately as well as in combination at the
same position in the vector pHB102 (see Example 1).

2.3. Characterization of the influence of FkpA and SlyX on phage display.

To determine how and why FkpA and SlyX get enriched, we characterized the 
phages produced in the absence and presence of co-expressed FkpA and SlyX. 
The antigen binding phage ELISA (Fig. 9) shows that over-expression of FkpA
increases the number of functional antibody molecules on the phage, compared to
no over-expression or over-expression of Skp. In contrast, the over-expression of
SlyX has no effect on the number of functional antibody molecules on the phage.